DocLens : A Tool-Augmented Multi-Agent Framework for Long Visual Document Understanding
Zhu, Dawei, Meng, Rui, Chen, Jiefeng, Li, Sujian, Pfister, Tomas, Yoon, Jinsung
Comprehending long visual documents, where information is distributed across extensive pages of text and visual elements, is a critical but challenging task for modern Vision-Language Models (VLMs). Existing approaches falter on a fundamental challenge: evidence localization. They struggle to retrieve relevant pages and overlook fine-grained details within visual elements, leading to limited performance and model hallucination. To address this, we propose DocLens, a tool-augmented multi-agent framework that effectively "zooms in" on evidence like a lens. It first navigates from the full document to specific visual elements on relevant pages, then employs a sampling-adjudication mechanism to generate a single, reliable answer. Paired with Gemini-2.5-Pro, DocLens achieves state-of-the-art performance on MMLongBench-Doc and FinRAGBench-V, surpassing even human experts. The framework's superiority is particularly evident on vision-centric and unanswerable queries, demonstrating the power of its enhanced localization capabilities.
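The abstract does not spell out how the sampling-adjudication mechanism works; a minimal sketch, assuming the adjudicator reduces several sampled candidate answers to one by majority vote (an assumption, not the paper's stated method), might look like:

```python
from collections import Counter

def adjudicate(candidates):
    """Majority-vote adjudication: return the most frequent candidate answer.

    The paper's actual adjudication procedure is not specified in the
    abstract; a simple majority vote is one plausible stand-in.
    """
    counts = Counter(candidates)
    answer, _ = counts.most_common(1)[0]
    return answer

# Candidates sampled from a VLM for the same query (simulated here).
candidates = ["Q3 revenue rose 12%", "Q3 revenue rose 12%", "Q3 revenue fell"]
final = adjudicate(candidates)
```

In practice the adjudicator could also weigh candidates by the strength of their localized evidence rather than by raw frequency.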
OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always!
Lei, Jingdi, Gumma, Varun, Bhardwaj, Rishabh, Lim, Seok Min, Li, Chuan, Zadeh, Amir, Poria, Soujanya
Large Language Model (LLM) safety is one of the most pressing challenges for enabling wide-scale deployment. While most studies and global discussions focus on generic harms, such as models assisting users in harming themselves or others, enterprises face a more fundamental concern: whether LLM-based agents are safe for their intended use case. To address this, we introduce operational safety, defined as an LLM's ability to appropriately accept or refuse user queries when tasked with a specific purpose. We further propose OffTopicEval, an evaluation suite and benchmark for measuring operational safety both in general and within specific agentic use cases. Our evaluations on six model families comprising 20 open-weight LLMs reveal that while performance varies across models, all of them remain highly operationally unsafe. Even the strongest models - Qwen-3 (235B) with 77.77% and Mistral (24B) with 79.96% - fall far short of reliable operational safety, while GPT models plateau in the 62-73% range, Phi achieves only mid-level scores (48-70%), and Gemma and Llama-3 collapse to 39.53% and 23.84%, respectively. While operational safety is a core model alignment issue, to suppress these failures, we propose prompt-based steering methods: query grounding (Q-ground) and system-prompt grounding (P-ground), which substantially improve OOD refusal. Q-ground provides consistent gains of up to 23%, while P-ground delivers even larger boosts, raising Llama-3.3 (70B) by 41% and Qwen-3 (30B) by 27%. These results highlight both the urgent need for operational safety interventions and the promise of prompt-based steering as a first step toward more reliable LLM-based agents.
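The exact Q-ground prompt template is not given in the abstract; the sketch below only illustrates the idea of restating the agent's operational purpose next to the user query so the model can judge topical relevance (the wording is a hypothetical placeholder, not the paper's prompt):

```python
def q_ground(user_query: str, agent_purpose: str) -> str:
    """Hypothetical query-grounding (Q-ground) wrapper: pairs the query
    with an explicit statement of the agent's purpose so the model can
    refuse out-of-domain requests."""
    return (
        f"Your sole purpose is: {agent_purpose}.\n"
        f"If the query below falls outside that purpose, refuse to answer.\n"
        f"Query: {user_query}"
    )

prompt = q_ground("Write me a poem about the sea.",
                  "answering airline baggage-policy questions")
```

P-ground would apply the same grounding idea at the system-prompt level instead of rewriting each query.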
Red Teaming for Generative AI, Report on a Copyright-Focused Exercise Completed in an Academic Medical Center
Wen, James, Nalawade, Sahil, Liang, Zhiwei, Bielick, Catherine, Boston, Marisa Ferrara, Chowdhury, Alexander, Collin, Adele, De Angelis, Luigi, Ellen, Jacob, Frase, Heather, Gameiro, Rodrigo R., Gutierrez, Juan Manuel, Kadam, Pooja, Keceli, Murat, Krishnamurthy, Srikanth, Kwok, Anne, Lu, Yanan Lance, Mattie, Heather, McCoy, Liam G., Miller, Katherine, Morgan, Allison C., Moerig, Marlene Louisa, Nguyen, Trang, Owen-Post, Alexander, Ruiz, Alex D., Puchala, Sreekar Reddy, Samineni, Soujanya, Tohyama, Takeshi, Ullanat, Varun, Valenza, Carmine, Velez, Camilo, Wang, Pengcheng, Wuest, Anna, Zhou, Yuxiang, Zhu, Yingde, Johnson, Jason M., Lenane, Naomi, Willcox, Jennifer, Vitiello, Francis J., Celi, Leo Anthony G., Umeton, Renato
Background: Generative artificial intelligence (AI) deployment in academic medical settings raises copyright compliance concerns. Dana-Farber Cancer Institute implemented GPT4DFCI, an internal generative AI tool utilizing OpenAI models, that is approved for enterprise use in research and operations. Given (1) the exceptionally broad adoption of the tool in our organization, (2) our research mission, and (3) the shared responsibility model required to benefit from Customer Copyright Commitment in Azure OpenAI Service products, we deemed rigorous copyright compliance testing necessary. Case Description: We conducted a structured red teaming exercise in Nov. 2024, with 42 participants from academic, industry, and government institutions. Four teams attempted to extract copyrighted content from GPT4DFCI across four domains: literary works, news articles, scientific publications, and access-restricted clinical notes. Teams successfully extracted verbatim book dedications and near-exact passages through various strategies. News article extraction failed despite jailbreak attempts. Scientific article reproduction yielded only high-level summaries. Clinical note testing revealed appropriate privacy safeguards. Discussion: The successful extraction of literary content indicates potential copyrighted material presence in training data, necessitating inference-time filtering. Differential success rates across content types suggest varying protective mechanisms. The event led to implementation of a copyright-specific meta-prompt in GPT4DFCI; this mitigation has been in production since Jan. 2025. Conclusion: Systematic red teaming revealed specific vulnerabilities in generative AI copyright compliance, leading to concrete mitigation strategies. Academic medical institutions deploying generative AI should implement continuous testing protocols to ensure legal and ethical compliance.
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding
Xu, Zhangchen, Liu, Yang, Yin, Yueqin, Zhou, Mingyuan, Poovendran, Radha
We introduce KodCode, a synthetic dataset that addresses the persistent challenge of acquiring high-quality, verifiable training data, across diverse difficulties and domains, for coding-focused Large Language Models. Existing code-focused resources typically fail to ensure either breadth of coverage (e.g., spanning simple coding tasks to advanced algorithmic problems) or verifiable correctness (e.g., unit tests). In contrast, KodCode comprises question-solution-test triplets that are systematically validated via a self-verification procedure. Our pipeline begins by synthesizing a broad range of coding questions, then generates solutions and test cases, with additional attempts allocated to challenging problems. Finally, for post-training data synthesis, questions are rewritten into diverse formats and responses are generated under a test-based reject-sampling procedure using a reasoning model (DeepSeek R1). This pipeline yields a large-scale, robust, and diverse coding dataset. KodCode is suitable for supervised fine-tuning, and the paired unit tests also offer great potential for RL tuning. Fine-tuning experiments on coding benchmarks (HumanEval(+), MBPP(+), BigCodeBench, and LiveCodeBench) demonstrate that KodCode-tuned models achieve state-of-the-art performance, surpassing models like Qwen2.5-Coder-32B-Instruct and DeepSeek-R1-Distill-Llama-70B.
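Test-based reject sampling, as described in the pipeline, can be sketched as follows; `generate` and `unit_test` are toy stand-ins for the reasoning model and KodCode's verification harness (the real system's interfaces are assumptions here):

```python
import random

def reject_sample(question, generate, unit_test, max_attempts=10, seed=0):
    """Keep only candidate solutions that pass the paired unit test;
    retry up to max_attempts times, then give up."""
    rng = random.Random(seed)
    for _ in range(max_attempts):
        candidate = generate(question, rng)
        try:
            if unit_test(candidate):
                return candidate
        except Exception:
            continue  # a crashing solution also fails verification
    return None

# Toy stand-ins: the generator proposes add/sub implementations at random;
# the paired test only accepts addition.
def toy_generate(question, rng):
    return (lambda a, b: a + b) if rng.random() < 0.5 else (lambda a, b: a - b)

def toy_test(fn):
    return fn(2, 3) == 5

solution = reject_sample("implement add(a, b)", toy_generate, toy_test)
```

Allocating more attempts (a larger `max_attempts`) to harder questions mirrors the pipeline's extra sampling budget for challenging problems.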
Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping
Wang, Haoyu, Ma, Guozheng, Meng, Ziqiao, Qin, Zeyu, Shen, Li, Zhang, Zhong, Wu, Bingzhe, Liu, Liu, Bian, Yatao, Xu, Tingyang, Wang, Xueqian, Zhao, Peilin
Self-alignment is an effective way to reduce the cost of human annotation while ensuring promising model capability. However, most current methods complete the data collection and training steps in a single round, which may overlook the continuously improving ability of self-aligned models. This raises a key question: what if we bootstrap self-alignment over multiple rounds? Does this strategy enhance model performance or lead to rapid degradation? In this paper, we explore the impact of bootstrapping self-alignment on large language models. Our findings reveal that bootstrapping self-alignment markedly surpasses the single-round approach by guaranteeing data diversity through in-context learning. To further exploit the capabilities of bootstrapping, we investigate and adjust the training order of the data, which yields improved model performance. Drawing on these findings, we propose Step-On-Feet Tuning (SOFT), which leverages the model's continuously enhanced few-shot ability to boost zero- and one-shot performance. Building on an easy-to-hard training recipe, we propose SOFT+, which further boosts self-alignment performance. Our experiments demonstrate the efficiency of SOFT (SOFT+) across various classification and generation tasks, highlighting the potential of bootstrapping self-alignment for continually enhancing model alignment performance.
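The multi-round loop described above can be sketched schematically; all functions and the scalar "ability" score below are toy stand-ins for illustration, not the paper's implementation:

```python
def bootstrap_self_alignment(ability, generate_data, train, rounds=3):
    """Multi-round bootstrapping sketch: each round, the current model
    generates fresh alignment data via few-shot in-context learning, the
    data is ordered easy-to-hard (the SOFT+ recipe), and the model is
    retrained on its own outputs."""
    for _ in range(rounds):
        data = generate_data(ability)               # self-generated examples
        data.sort(key=lambda ex: ex["difficulty"])  # easy-to-hard ordering
        ability = train(ability, data)              # retrain on own outputs
    return ability

# Toy dynamics: a stronger model yields more data, which improves training.
def toy_generate(ability):
    n = int(10 * ability)
    return [{"difficulty": i % 3, "text": f"ex{i}"} for i in range(n)]

def toy_train(ability, data):
    return min(1.0, ability + 0.02 * len(data))

final_ability = bootstrap_self_alignment(0.5, toy_generate, toy_train)
```

The single-round baseline corresponds to `rounds=1`; the paper's question is whether iterating this loop improves or degrades the model.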
Google now lets users ask for images of minors to be removed from Search
Google has activated a safety feature that lets minors under 18 request that images of themselves be removed from search results, The Verge has reported. Google first announced the option back in August as part of a slate of new safety measures for kids, but it's now rolling out widely to users. Google said it will remove any images of minors "with the exception of cases of compelling public interest or newsworthiness." The requests can be made by minors, their parents, guardians or other legal representatives. To do so, you'll need to supply the URLs you want removed, the name and age of the minor and the name of the person acting on their behalf.
Design Framework for Chatbots – Chatbots Magazine
When I started designing chatbots for BEEVA almost a year ago, I applied some of my UX knowledge and did some unsuccessful research looking for tools that could fit my needs. Actually, I was quite amazed that I couldn't find practical literature about the topic. There are tons of chatbots out there, but there's little about how companies actually get hands-on. I already shared some of my findings here, and here, with tools I found, general knowledge about designing chatbots and UX design applied to chatbots, but I think it would be great to give a deeper explanation of how exactly I face the situation on a regular basis. While many people immediately start thinking about how to manage the user flow, I separate my process into 4 different steps: the bot scope, the chatbot personality, a prioritized list of must-have features and the chatbot flow.
Google Assistant makes memes when users ask for them
Have you ever had a great idea for a meme, but didn't feel like going to the effort of making it yourself? If the answer is yes, then a new Google Assistant feature may help you out. Called 'Meme Buddy,' the feature lets you make memes using your voice. By using the Google Assistant on your phone, you can ask Meme Buddy to find a photo and give it a caption - or ask it to make a completely random meme for you. To use Meme Buddy, open the Google Assistant on your phone and say 'Talk to Meme Buddy.'
Amazon launches Spotify rival in the UK
Amazon is launching its Spotify rival, Amazon Music Unlimited, in the UK with a headline price of £9.99 a month and access to a library of "over 40 million" songs. Those figures are comparable to competitors like Spotify and Apple Music, but where Amazon is focusing its competition is on a pair of special offers, as well as the tight integration with its own hardware ecosystem. The service is cheaper for those with an Amazon Prime subscription, the company's all-you-can-eat free delivery package. The monthly cost drops by £2, to £7.99, and an annual payment plan is available for £79 a year, taking the equivalent monthly fee to £6.58. That's how Amazon is hoping to entice users to upgrade from its other free streaming service, Amazon Prime Music.